The MetaCycle package is mainly used for detecting rhythmic signals from large scale time-series data. It incorporates ARSER(ARS), JTK_CYCLE(JTK), and Lomb-Scargle(LS) properly for periodic signal detection, and could also output integrated analysis results if required.
The usual time-series data is evenly sampled once at each time point, and the interval between neighbour time points is integer. Not all data are as simple as this. There are datasets with replicate samples, or with missing values, or un-evenly sampled, or sampled with a non-integer interval. Examples of these types of data are shown in the below table.
| Data Type | Point 1 | Point 2 | Point 3 | Point 4 | Point 5 | Point 6 |
|---|---|---|---|---|---|---|
| The usual data | CT0 | CT4 | CT8 | CT12 | CT16 | CT20 |
| With missing value | CT0 | NA | CT8 | CT12 | CT16 | CT20 |
| With replicates | CT0 | CT0 | CT8 | CT8 | CT16 | CT16 |
| With un-even interval | CT0 | CT2 | CT8 | CT10 | CT16 | CT20 |
| With non-integer interval | CT0 | CT4.5 | CT9 | CT13.5 | CT18 | CT22.5 |
Of course, some datasets may seem combination of two or more of above types of data.
| Data Type | Point 1 | Point 2 | Point 3 | Point 4 | Point 5 | Point 6 |
|---|---|---|---|---|---|---|
| With replicates and missing value | CT0 | CT0 | CT8 | NA | CT16 | CT16 |
| With un-even interval and replicates | CT0 | CT2 | CT2 | CT10 | CT16 | CT20 |
The meta2d function in MetaCycle is designed to analyze differnt types of time-series datasets, and it could automatically select proper method to analyze different types of input datasets. The implementation strategy used for meta2d is shown in the flow chart.
In addition to selecting proper methods to analyze different kinds of datasets, meta2d could also output integrated results from multiple methods. Detail explaination about integration stragegies is in the vignettes of MetaCycle.
meta2d recalculates the amplitude with following model: \[Y_i = B + TRE*(t_i - \frac{\sum_{i=1}^n t_i}{n}) + A*cos(2*\pi*\frac{t_i - PHA}{PER})\]
The baseline and trend level are explained in the below example.
| Column name | Description | Column name | Description |
|---|---|---|---|
| ARS_pvalue | pvalue from ARS | LS_BH.Q | FDR from LS |
| ARS_BH.Q | FDR from ARS | LS_period | period from LS |
| ARS_period | period from ARS | LS_adjphase | adjusted phase from LS |
| ARS_adjphase | adjusted phase from ARS | LS_amplitude | amplitude from LS |
| ARS_amplitude | amplitude from ARS | meta2d_pvalue | integrated pvalue |
| JTK_pvalue | pvalue from JTK | meta2d_BH.Q | FDR based on integrated pvalue |
| JTK_BH.Q | FDR from JTK | meta2d_period | averaged period |
| JTK_period | period from JTK | meta2d_phase | integrated phase |
| JTK_adjphase | adjusted phase from JTK | meta2d_Base | baseline value given by meta2d |
| JTK_amplitude | amplitude from JTK | meta2d_AMP | amplitude given by meta2d |
| LS_pvalue | pvalue from LS | meta2d_rAMP | relative amplitude |
# load 'shiny' package
library("shiny")
# load 'MetaCycle' package
library("MetaCycle")
# load 'dplyr' package
library("dplyr")
# load 'ggplot2' package
library("ggplot2")
# load 'cowplot' package
library("cowplot")
# change working directory to the desktop
setwd("~/Desktop")
# check required directories under the desktop
dir()
# load shiny package
library("shiny")
# use 'runGitHub' function of shiny package
runGitHub("MetaCycleApp", "gangwug")
# load shiny package
library("shiny")
# use 'runApp' function of shiny package
runApp("~/Desktop/MetaCycleApp-master")
# load dplyr package
library("dplyr")
# read in the 'meta2d_experientA.csv' file
dataD <- read.csv("~/Desktop/SRBR_SMTSAworkshop-master/result/meta2d_experimentA.csv", stringsAsFactors = FALSE)
# take a look at column names of the data
colnames(dataD)
## [1] "CycID" "JTK_pvalue" "JTK_BH.Q" "JTK_period"
## [5] "JTK_adjphase" "JTK_amplitude" "LS_pvalue" "LS_BH.Q"
## [9] "LS_period" "LS_adjphase" "LS_amplitude" "meta2d_pvalue"
## [13] "meta2d_BH.Q" "meta2d_period" "meta2d_phase" "meta2d_Base"
## [17] "meta2d_AMP" "meta2d_rAMP" "X18" "X20"
## [21] "X22" "X24" "X26" "X28"
## [25] "X30" "X32" "X34" "X36"
## [29] "X38" "X40" "X42" "X44"
## [33] "X46" "X48" "X50" "X52"
## [37] "X54" "X56" "X58" "X60"
## [41] "X62" "X64"
# filter the data by meta2d_BH.Q < 0.05 by 'filter' function of dplyr
cirD <- filter(dataD, meta2d_BH.Q < 0.05)
# see the number of circadian transcripts
nrow(cirD)
## [1] 480
# select "CycID", "meta2d_period", "meta2d_phase", and "X18" to "X64" columns for drawing heatmap by 'select' function of dplyr package
figureD <- select(cirD, CycID, meta2d_period, meta2d_phase, num_range("X", seq(18, 64, by=2), width = 2))
# load 'heatmapF' and 'phaseHist' function used to draw heatmap and histogram
source("~/Desktop/SRBR_SMTSAworkshop-master/R/fig.R")
# get the heatmap figure by 'heatmapF' function
heatmapFigure <- heatmapF(inputD = figureD, minfold=0.5, maxfold=2)
heatmapFigure
# get the phase distribtuion figure by 'phaseHist' function
histFigure <- phaseHist(inputD = figureD, binvalue=seq(0,24,by=2), histcol = "blue")
histFigure
# make sure that 'figureD' dataframe is not deleted, and select its 'CycID' and 'meta2d_phase' columns for later analysis
phaseD <- select(figureD, CycID, meta2d_phase)
# if it reports error after typing above command, please re-run the code of preparing 'figureD' in the first part of 3.5 section of this demo
# read in the annotation file-'annoDbiMainInfMouse4302.txt' in the 'data-raw' directory
annoD <- read.delim("~/Desktop/SRBR_SMTSAworkshop-master/data-raw/annoDbiMainInfMouse4302.txt", stringsAsFactors = FALSE)
dim(cirD)
## [1] 480 42
dim(figureD)
## [1] 480 27
# add annotation information with 'inner_join' function of dplyr package
phaseD <- inner_join(figureD, annoD, by=c("CycID" = "PROBEID"))
# take a look at the phaseD
head(phaseD)
## CycID meta2d_period meta2d_phase X18 X20 X22 X24
## 1 1439401_x_at 24 11.74247 3493.5 3249.0 3016.6 3210.0
## 2 1439401_x_at 24 11.74247 3493.5 3249.0 3016.6 3210.0
## 3 1439401_x_at 24 11.74247 3493.5 3249.0 3016.6 3210.0
## 4 1456766_at 24 18.59238 646.8 771.4 400.9 499.7
## 5 1438743_at 24 15.34830 63726.3 62767.9 49438.7 47423.6
## 6 1438743_at 24 15.34830 63726.3 62767.9 49438.7 47423.6
## X26 X28 X30 X32 X34 X36 X38 X40 X42
## 1 3398.6 2444.6 3742.5 4172.3 4878.1 3620.5 4107.8 4078.0 4104.8
## 2 3398.6 2444.6 3742.5 4172.3 4878.1 3620.5 4107.8 4078.0 4104.8
## 3 3398.6 2444.6 3742.5 4172.3 4878.1 3620.5 4107.8 4078.0 4104.8
## 4 419.0 444.2 363.5 302.7 456.3 440.5 442.3 702.9 590.3
## 5 22500.7 29290.3 20974.4 30461.9 91815.1 88426.8 77997.2 62750.3 40474.0
## 6 22500.7 29290.3 20974.4 30461.9 91815.1 88426.8 77997.2 62750.3 40474.0
## X44 X46 X48 X50 X52 X54 X56 X58 X60
## 1 3489.2 3062.5 3600.3 3826.7 3359.3 3203.7 4356.2 4913.6 4686.7
## 2 3489.2 3062.5 3600.3 3826.7 3359.3 3203.7 4356.2 4913.6 4686.7
## 3 3489.2 3062.5 3600.3 3826.7 3359.3 3203.7 4356.2 4913.6 4686.7
## 4 643.1 634.7 451.9 450.0 391.1 344.4 169.7 429.9 465.1
## 5 59169.7 43910.6 64848.9 24069.7 27574.1 17585.7 52368.7 78558.2 59282.3
## 6 59169.7 43910.6 64848.9 24069.7 27574.1 17585.7 52368.7 78558.2 59282.3
## X62 X64 ENTREZID SYMBOL
## 1 4220.8 4081.7 26932 Ppp2r5e
## 2 4220.8 4081.7 26932 Ppp2r5e
## 3 4220.8 4081.7 26932 Ppp2r5e
## 4 804.0 762.8 77609 Ccdc151
## 5 94330.0 65719.0 13122 Cyp7a1
## 6 94330.0 65719.0 13122 Cyp7a1
## GENENAME ENZYME PATH
## 1 protein phosphatase 2, regulatory subunit B', epsilon <NA> 3015
## 2 protein phosphatase 2, regulatory subunit B', epsilon <NA> 4114
## 3 protein phosphatase 2, regulatory subunit B', epsilon <NA> 4310
## 4 coiled-coil domain containing 151 <NA> NA
## 5 cytochrome P450, family 7, subfamily a, polypeptide 1 1.14.13.17 120
## 6 cytochrome P450, family 7, subfamily a, polypeptide 1 1.14.13.17 140
dim(phaseD)
## [1] 885 32
# select "CycID", by 'select' function of dplyr package
#figureD <- select(cirD, CycID, meta2d_period, meta2d_phase, num_range("X", seq(18, 64, by=2), width = 2))